Method and apparatus for generating privacy profiles

ABSTRACT

A privacy processing system may use privacy rules to filter sensitive personal information from web session data. The privacy processing system may generate privacy profiles or privacy metadata that identifies how often the privacy rules are called, how often the privacy rules successfully complete actions, and the processing time required to execute the privacy rules. The privacy profiles may be used to detect irregularities in the privacy filtering process that may be associated with a variety of privacy filtering and web session problems.

RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 13/658,155, filed Oct. 23, 2012, which is herein incorporated by reference in its entirety.

BACKGROUND

Monitoring and replay systems may capture web session data, such as web pages sent from a web application server to a client computer and user interface events entered into the web pages at the client computer. The captured web session data may be used to replay and analyze user experiences during the web sessions. For example, the replayed web sessions may be used to identify problems users may have while navigating through web pages during the web session.

Sensitive personal information may be entered into the web pages during the web sessions. For example, the web sessions may involve on-line purchases of products and/or services. In order to complete the on-line transactions, users may need to enter social security numbers, passwords, credit card numbers, bank account numbers, health information, stock information, home addresses, or the like, or any combination thereof.

Government privacy regulations may prohibit the retention of certain personal information or limit the retention of the personal information to certified entities. These privacy regulations may require monitoring and replay systems to filter sensitive personal information before storing the captured web session data in a database for subsequent replay analysis.

Current monitoring and replay systems attempt to remove sensitive personal information. However, some personal information may not be successfully filtered from the captured web session data. For example, a web application may change the name of a web page or the name of a field in the web page that was previously used for triggering the privacy rules that filter the sensitive personal information. If the sensitive personal information is not filtered, some or all of the captured web session data may need to be destroyed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts an example of a system for filtering information from captured web session data.

FIG. 2 depicts an example of a privacy processing system.

FIG. 3 depicts examples of privacy metrics generated by the privacy processing system.

FIG. 4 depicts an example process for comparing privacy metrics with privacy profiles.

FIG. 5 depicts an example process for generating privacy metrics.

FIG. 6 depicts an example process for generating privacy profiles.

FIG. 7 depicts an example of a process for detecting irregular privacy metrics.

FIGS. 8A and 8B depict example graphs displaying average execution times for privacy rules.

FIGS. 9A and 9B depict example graphs displaying percentage of successful completions for privacy rules.

FIGS. 10A, 10B and 10C depict examples of replayed web sessions showing correct and incorrect privacy filtering.

FIG. 11 depicts an example computing device for implementing the privacy processing system.

DETAILED DESCRIPTION

FIG. 1 depicts a web session 100 conducted between a web application 104 operating on a web server 102 and a computing device 110. Web application 104 may support any type of online web session such as online purchases, online financial or medical services, social networking, etc. Of course, these are just examples, and any type of electronic web based transaction or activity may be performed using web application 104.

Computing device 110 may comprise a Personal Computer (PC), laptop computer, wireless Personal Digital Assistant (PDA), cellular telephone, smart phone, tablet, or any other wired or wireless device that accesses and exchanges information with web application 104. Any number of computing devices 110 may conduct different web sessions 100 with web application 104 at any geographical location and at any time of day.

Computing device 110 may communicate with web application 104 over a network connection 108. Network connection 108 may comprise any combination of connections over an Internet network, a wireless network, a WiFi network, a telephone network, a Public Switched Telephone Network (PSTN), a cellular network, a cable network, a Wide Area Network (WAN), a Local Area Network (LAN), or the like, or any combination thereof.

In one example, a web browser or web application 118 operating on computing device 110 may send Hyper Text Transfer Protocol (HTTP) requests to web application 104 over network connection 108. Web application 104 may send back one or more of web pages 106 in response to the HTTP requests and computing device 110 may display the web pages via the web browser or application 118 on a computer screen 116. For example, web browser or mobile application 118 may display an electronic web page 112 that contains fields 114A-114C for entering a user name, password, and social security number, respectively. Web application 104 may send additional web pages 106 and/or responses to computing device 110 in response to the information entered into fields 114.

Web session monitors 122 may capture web session data 124 during web session 100. Web session data 124 may include network data transferred over network connection 108 between computing device 110 and web application 104 and user interface events generated on computing device 110. For example, web session data 124 may comprise the Hyper Text Transfer Protocol (HTTP) requests and other data requests sent from computing device 110 to web application 104 and the Hyper Text Markup Language (HTML) web pages 106 and other responses sent back to computing device 110 from web application 104.

Some of web session data 124 may include user interface events entered by a user into computing device 110, such as mouse clicks, keystrokes, alpha-numeric data, or the like, or any combination thereof. For example, some of the user interface events may comprise data entered into fields 114 of web page 112 or may comprise selections of icons or links on web page 112.

Other web session data 124 may comprise web page logic/code sent by web application 104 along with web pages 106 to computing device 110 that further determine the different states or operations in the web pages. Some of the web session data may be generated locally on processing device 110 and never sent over network connection 108. For example, the control logic within web page 112 may change the state of web page 112 in response to user inputs without sending any data back to web application 104. In another example, a batch data transfer of only completed information in web page 112 may be transferred back to web application 104 over network connection 108.

In another example, some web session data 124 may comprise document object model (DOM) events within the web pages. For example, changes in the DOM of displayed web page 106 may be captured by UI event monitor 122A as some of web session data 124. In yet another example, web session data 124 may comprise operating parameters or any other logged data in computing device 110 and/or web server 102. For example, web session data 124 may comprise network bandwidth indicators, processor bandwidth indicators, network condition indicators, computer operating conditions, or the like, or any combination thereof.

In one example, network session monitor 122B may capture the network data, such as web pages, requests, responses, and/or logic exchanged between computing device 110 and web application 104 over network connection 108. User interface (UI) monitor 122A may capture the user interface events generated locally at computing device 110. In another example, UI monitor 122A also may capture some or all of the network data exchanged between computing device 110 and web application 104 over network connection 108.

In yet another example, UI event monitor 122A and/or network session monitor 122B may not capture all the web session data 124 and may only detect occurrences of some web session events. In this example, monitors 122A and 122B may send unique identifiers identifying occurrences of certain web session events and may send timestamps indicating when the web session events were detected.

Examples of systems for capturing and/or identifying web session data and events are described in U.S. Pat. No. 6,286,030 issued Sep. 4, 2001, entitled: Systems and Methods for Recording and Visually Recreating Sessions in a Client-Server Environment, now reissued as U.S. Pat. No. RE41903; U.S. Pat. No. 8,127,000 issued Feb. 28, 2012, entitled: Method and Apparatus for Monitoring and Synchronizing User Interface Events with Network Data; and U.S. patent application Ser. No. 13/419,179 filed Mar. 13, 2012, entitled: Method and Apparatus for Intelligent Capture of Document Object Model Events, which are all herein incorporated by reference in their entireties.

During web session 100, a user may enter a user name into field 114A, a password into field 114B and/or a social security number into field 114C. Due to the security requirements discussed above, the password and/or social security number may need to be filtered before captured web session data 124 can be stored in a database 136.

A privacy processing system 130 filters sensitive personal information, such as the password and/or social security number, from captured web session data 124. Filtering refers to any combination of removing, blocking, replacing, encrypting, hashing, or the like, information in web session data 124. Privacy processing system 130 stores filtered web session data 138 in web session database 136. A replay system 134 can then use the captured and now filtered web session data 138 to replay the original web session 100 without displaying sensitive personal information. One example of a replay system 134 is described in U.S. Pat. No. 8,127,000, entitled: METHOD AND APPARATUS FOR MONITORING AND SYNCHRONIZING USER INTERFACE EVENTS WITH NETWORK DATA, issued Feb. 28, 2012, which has been incorporated by reference in its entirety.

Privacy processing system 130 may apply privacy rules to the captured web session data 124 to remove the sensitive personal information. Privacy profiles or privacy metadata may be generated for the privacy rules. For example, the privacy profiles may identify how often the privacy rules are called, how often the privacy rules successfully complete actions, and amounts of processing time required to execute the privacy rules. The privacy profiles may detect privacy filtering problems, such as privacy rules that do not filter personal information or filter the wrong information from the web session data, or if certain patterns of data require abnormally large privacy resources, such as time or CPU usage. In addition, any large deviation in privacy resource usage may indicate a change to the website or in end user behavior.

Filtering and/or encrypting sensitive personal information in captured web session data may be computationally expensive. For example, a particular web site may service hundreds of millions of users and hundreds of millions of associated web sessions each day. The privacy profiles may identify privacy rules that are written incorrectly or inefficiently and waste processing bandwidth. The identified privacy rules can then be rewritten so the privacy rules can more efficiently search and filter personal information from the millions of web sessions.

Privacy processing system 130 may detect other web session events or states that may impact user experiences during web sessions 100. For example, the privacy profiles may identify incorrect user actions, virus attacks, errors or gaps in the logic of the source web application, etc. Thus, privacy processing system 130 not only generates quantitative privacy filtering metrics but may also identify other general problems with the web sessions.

FIG. 2 depicts in more detail one example of privacy processing system 130. A privacy rule parser 148 may apply privacy rules 150 to web session data 124 captured from web sessions 100. Different rules 150 may be applied to different web pages and different data captured during web session 100. For example, a first rule 150A may search for a particular web page in captured web session data 124 that may contain a social security number. A second rule 150B may search for different fields in several different web pages in web session data 124 that may contain a credit card number.

Rules 150A and 150B may filter the web session data by replacing, blocking, hashing, encrypting, etc. sensitive personal information, such as social security numbers or credit card numbers. Filtered web session data 138 is stored in database 136 and can then be used for subsequent replay and analysis by replay system 134.

A privacy profiler 152 may generate privacy profiles 158 for privacy rules 150. For example, privacy profiler 152 may track the number of times each rule 150 is called when filtering web session data 124, the number of times each rule 150 succeeded in filtering information in web session data 124, and/or the amount of time required to execute rules 150 while filtering web session data 124.

Privacy profiler 152 may generate privacy profiles 158 by aggregating the privacy metrics for rules 150. For example, privacy profiler 152 may calculate an average number of times over the last five minutes that rule 150A is called while filtering web session data 124. The aggregated privacy metrics may be used as a baseline or “profile” of typical or normal privacy filtering behavior. Privacy metrics outside of privacy profile thresholds may be indicative of privacy filtering problems or other web session problems. For example, there may be a substantial change in the average number of times a particular privacy rule is called for each captured web session. In another example, there may be a substantial change in the amount of time required to successfully complete execution of a privacy rule.

These changes in the privacy metrics may be caused by changes in the web application or web pages used by the web application. As mentioned above, a user may enter personal information into particular fields within particular web pages. Privacy rule parser 148 may only call rule 150A when particular web page names or field names are identified in captured web session data 124. If the enterprise operating the web application changes the web page name or field name, privacy rule parser 148 may no longer call rule 150A to filter data in the renamed web page. Accordingly, a social security number entered into the renamed web page may no longer be filtered from web session data 124, compromising the overall privacy and security of filtered web session data 138.

In another example, a change in the amount of time required to successfully complete privacy rules 150 also may indicate a privacy filtering problem. For example, web browsers, web pages, and/or web page logic may change or reformat data entered into the web pages. The changed or reformatted data may cause privacy rule parser 148 to unintentionally call privacy rules 150 for web pages that do not contain personal information or may cause rules 150 to filter the wrong data. These incorrect or unintentional filtering operations may waste processing bandwidth and/or remove web session data needed for accurately replaying the captured web sessions.

In yet another example, a change in the privacy metrics may indicate an abnormal web session. For example, a denial of service attack or other bot attacks may substantially change the number or percentage of times a particular rule 150 is called or the amount of time required to successfully complete the privacy rule. Privacy profiles 158 can identify privacy metric changes and isolate privacy filtering or web session problems.

A privacy event processor 154 may display graphs 162 on a computing device 160 for privacy profiles 158. For example, graphs 162 may identify the average privacy metric values for different privacy rules. Any substantial deviations in graphs 162 may indicate privacy filtering problems and/or web session problems. A user may direct privacy event processor 154 to display the privacy profiles for different dimensions, such as for particular privacy rules, privacy rule parameters, web session categories, or web browsers. For example, the user may direct privacy event processor 154 to display privacy metrics associated with different desktop web browsers and mobile applications.

Computing device 160 may use replay system 134 to further isolate privacy filtering irregularities. For example, a user may compare replayed filtered web sessions prior to large privacy metric changes with replayed web sessions after the large privacy metric changes. Any filtering problems identified during replay can then be corrected by modifying the associated rules 150 in privacy rule parser 148.

FIG. 3 depicts in more detail examples of privacy rules 150 and privacy metrics 176 used and generated by the privacy rule parser. A first rule 150A may comprise a test 172A that looks for a particular set of numbers, text, values, parameters, etc. For example, test 172A may test for a first set of three numbers separated from a second set of two numbers by a space or dash. Test 172A may then look for a third set of four numbers separated by a second space or dash from the second set of two numbers.

Upon detection of the number sequence, rule 150A may trigger an action 174A. In one example, the action 174A for rule 150A may replace the detected sequence of numbers with an “X”. Examples of other actions may include only “X-ing” out some of the numbers in the identified sequence or using a hash algorithm to encrypt the number sequence. In one example, the encrypted number sequence might not be decrypted, but could still be used with the hash algorithm to confirm association of the number with a particular user. Any other actions may be used for filtering information.
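
By way of illustration only, a rule of this kind may be pictured as a pattern test paired with a masking action. The following minimal Python sketch is not the implementation described herein; the regular expression, function name, and fixed replacement string are assumptions chosen to mirror the three-two-four digit sequence of test 172A and the X-replacement of action 174A.

```python
import re

# Hypothetical test 172A: three digits, a space or dash, two digits,
# a space or dash, then four digits (a social-security-number shape).
SSN_TEST = re.compile(r"\b\d{3}[ -]\d{2}[ -]\d{4}\b")

def apply_ssn_rule(text: str) -> tuple[str, int]:
    """Replace each matched number sequence with X's (action 174A).

    Returns the filtered text and the number of substitutions made,
    which a profiler could record as a "# of completed actions" metric.
    """
    return SSN_TEST.subn("XXX-XX-XXXX", text)

filtered, completions = apply_ssn_rule("SSN: 123-45-6789")
print(filtered, completions)  # SSN: XXX-XX-XXXX 1
```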

A set of privacy metrics 176A may be generated by the privacy rule parser in association with privacy rule 150A. For example, privacy metrics 176 may include web session identifiers identifying the particular web sessions where rule 150A is called. A web page identifier may identify particular web pages in the web session data where rule 150A is called. A field identifier may identify particular fields in the captured web session data where rule 150A is called for filtering data entered into the fields.

A web browser identifier may identify a particular web browser or application used during the associated web session where rule 150A was called. A time stamp metric may identify when the web sessions producing the captured web session data took place and/or may identify when rule 150A was called for filtering the captured web session data. A “# of calls” metric may identify the number of times rule 150A was called while filtering the captured web session data or the number of times rule 150A was called for individual web sessions, web pages, etc. A “# of completed actions” metric may identify the number of times rule 150A successfully completed an associated filtering action. For example, the # of completed actions may identify the number of times rule 150A replaced the sequence in test 172A with the X's in action 174A. An execution time metric may identify the amount of processing time required to complete rule 150A for a web session, web page, etc.
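
One way to picture such a metrics record, purely as a hypothetical sketch rather than the system's actual data layout, is a simple structure holding the fields just listed:

```python
from dataclasses import dataclass

@dataclass
class PrivacyMetrics:
    """Hypothetical per-call metrics record for a privacy rule."""
    rule_id: str         # e.g. "150A"
    session_id: str      # web session where the rule was called
    page_id: str         # web page that triggered the rule
    field_id: str        # field whose data was filtered
    browser_id: str      # web browser or application used in the session
    timestamp: float     # when the rule was called
    num_calls: int       # "# of calls" metric
    num_completed: int   # "# of completed actions" metric
    exec_time_ms: float  # execution time metric
```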

A second rule 150B may comprise a test 172B that also parses the web session data for a particular set of numbers, text, values, parameters, etc. In this example, test 172B may test for a sequence of numbers associated with a credit card number. The number sequence may be four sets of four numbers separated by spaces or dashes. In one example, satisfaction of test 172B may initiate an associated action 174B that replaces the first twelve numbers of the detected sequence of sixteen numbers with “X's”. Another similar set of privacy metrics 176B may be generated for rule 150B.
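
Partial masking of this kind might look like the following sketch; again, the pattern and names are illustrative assumptions, pairing a four-by-four digit test (in the spirit of test 172B) with an action that X's out the first twelve digits while leaving the last four readable (in the spirit of action 174B).

```python
import re

# Hypothetical test 172B: four groups of four digits separated by
# spaces or dashes (a credit-card-number shape).
CARD_TEST = re.compile(r"\b\d{4}([ -])\d{4}\1\d{4}\1(\d{4})\b")

def apply_card_rule(text: str) -> tuple[str, int]:
    """Action 174B: replace the first twelve digits, keep the last four."""
    def mask(m: re.Match) -> str:
        sep = m.group(1)
        return sep.join(["XXXX", "XXXX", "XXXX", m.group(2)])
    return CARD_TEST.subn(mask, text)

filtered, completions = apply_card_rule("Card: 4111 1111 1111 1234")
print(filtered, completions)  # Card: XXXX XXXX XXXX 1234 1
```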

Rules 150A and 150B are just examples of any variety of tests and actions that may be applied to the web session data. Rules 150 may be applied to any combination of web session data. For example, rules 150 may only be called for particular web pages that query a user for sensitive personal information, such as a social security number. Other rules 150 may be called to filter data associated with other sensitive data captured on other web pages.

FIG. 4 depicts an example of privacy profiles 158 generated by privacy profiler 152 in FIG. 3 from the privacy metrics. Privacy profiles 158 may identify different statistical dimensions associated with filtering the web session data. For example, a first privacy profile 158A may identify the average number of times any of the rules in the privacy rule parser were called while filtering the web session data. Privacy profile 158A may be derived for different categories, such as for time periods, web sessions, web pages, web browsers and/or mobile web applications.

A privacy profile 158B may identify the average successful completion rate for the rules, indicating the average percentage of times the privacy rules successfully filter information. A privacy profile 158C may identify an average amount of time required for all of the rules to filter the web session data for web sessions, web pages, browsers, etc.

A privacy profile 158D may identify the average number of times a particular rule #1 is called while filtering the web session data. A privacy profile 158E may identify the successful completion rate for rule #1. Privacy profiles 158 may be associated with any other privacy rules and collected privacy dimensions.

The privacy profiler may generate aggregated average privacy metrics 180 for aggregation periods. For example, the privacy profiler may aggregate the number of times different rules are called, aggregate the percentage of times the rules successfully complete filtering actions, and aggregate the amounts of time required for the rules to complete execution. The privacy metrics may be aggregated for some selectable time period, such as five minutes, and the aggregated values averaged to generate privacy metrics 180.
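
The aggregation step might be sketched as follows; the record format, the five-minute window constant, and the function are assumptions for illustration, not the profiler's actual code.

```python
from collections import defaultdict

AGGREGATION_PERIOD_S = 5 * 60  # selectable aggregation period; five minutes here

def aggregate(records: list[dict]) -> dict[str, dict[str, float]]:
    """Average per-rule metrics over one aggregation period.

    Each record describes one rule invocation, e.g.
    {"rule": "rule1", "completed": True, "exec_ms": 2.4}.
    The result maps each rule to its averaged metrics 180.
    """
    buckets: defaultdict[str, list[dict]] = defaultdict(list)
    for rec in records:
        buckets[rec["rule"]].append(rec)

    averaged = {}
    for rule, rows in buckets.items():
        calls = len(rows)
        completed = sum(1 for r in rows if r["completed"])
        averaged[rule] = {
            "calls": calls,
            "completion_rate": 100.0 * completed / calls,
            "avg_exec_ms": sum(r["exec_ms"] for r in rows) / calls,
        }
    return averaged

print(aggregate([
    {"rule": "rule1", "completed": True, "exec_ms": 2.4},
    {"rule": "rule1", "completed": False, "exec_ms": 2.6},
]))  # rule1: 2 calls, 50.0% completion rate, 2.5 ms average
```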

Privacy event processor 154 may compare privacy metrics 180 for each aggregated time period with privacy profiles 158. A privacy metric notification 182 may be generated for any privacy metrics 180 that are outside of threshold ranges for privacy profiles 158. For example, the privacy profiler may determine standard deviations for the values in privacy profiles 158. Privacy event processor 154 may send notification 182 or create an entry in a log file for any of the privacy metrics 180 that are outside of the standard deviations for associated privacy profiles 158.

For example, rule #1 may have successfully completed 100% of the time over the last five minute aggregation period. The average successful completion rate in privacy profile 158E for rule #1 may be 80 percent and the standard deviation may be +/−4%. Thus, the threshold range for privacy profile 158E may be between 76% and 84%. Since the successful completion rate for rule #1 in privacy metrics 180 is outside of the threshold range for privacy profile 158E, privacy event processor 154 may generate a notification 182 or generate an entry in a file or table identifying privacy metric 180 for rule #1 as an outlier.
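
The numeric check in this example reduces to testing whether a new aggregated value falls outside a band of one standard deviation around the profile mean; a minimal sketch with hypothetical names follows.

```python
def is_outlier(value: float, mean: float, std_dev: float) -> bool:
    """Flag an aggregated metric outside the profile's threshold range."""
    return not (mean - std_dev <= value <= mean + std_dev)

# Worked example from the text: profile mean 80%, standard deviation 4%,
# threshold range 76%-84%, so a 100% completion rate is flagged.
print(is_outlier(100.0, 80.0, 4.0))  # True
```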

Privacy event processor 154 also may automatically determine differences in the web session data associated with different privacy metrics. For example, the captured web session data may include Document Object Models (DOMs) for the web pages. Privacy event processor 154 may detect privacy metrics outside of the privacy profile thresholds. The DOMs for filtered web pages having privacy metrics within the privacy profile thresholds may be compared with the DOMs for filtered web pages with privacy metrics outside of the privacy profile thresholds. The DOM differences may be identified and sent to the operator. For example, the DOM differences may identify a web page with a changed name that may have prevented rule #1 from correctly triggering.
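
One plausible way to surface such DOM differences, shown only as a sketch under the assumption that the captured DOMs can be serialized to text, is a line-oriented diff; Python's standard difflib suffices for the illustration.

```python
import difflib

def dom_differences(dom_within: str, dom_outside: str) -> list[str]:
    """Diff serialized DOMs of pages inside vs. outside profile thresholds."""
    diff = difflib.unified_diff(
        dom_within.splitlines(), dom_outside.splitlines(), lineterm=""
    )
    # Keep only added/removed lines, dropping the +++/--- file headers.
    return [
        line for line in diff
        if line[:1] in "+-" and not line.startswith(("+++", "---"))
    ]

# A renamed field of the kind that could stop rule #1 from triggering:
print(dom_differences('<input name="ssn">', '<input name="account_code">'))
# ['-<input name="ssn">', '+<input name="account_code">']
```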

FIG. 5 depicts an example process for generating privacy metrics. In operation 200, privacy rules are applied to captured web session data received from the web session monitoring systems. As mentioned above, any number of web sessions may be continuously, periodically, or randomly monitored and the associated web session data sent to the privacy processing system.

Privacy rules called or triggered during the privacy filtering process are identified in operation 202. For example, a rule may be called when a particular web page is identified in the web session data and the rule may not be called when that web page is never opened during the captured web session. In operation 204, privacy metrics are generated for the privacy rules. For example, each web session and web page may have an associated identifier. The privacy rule parser may identify the web sessions and/or web pages where the rules are triggered. The privacy rule parser may identify any other privacy metrics associated with the rules, such as a time or day when the rules are triggered, a type of web browser used on the client site where the rule was triggered, etc.

In operation 206, the privacy rule parser may determine if the rules are successfully completed. For example, the privacy rules may be triggered whenever a particular web page is identified. The triggered rule may then execute a test to identify any data in the web page satisfying a particular condition and/or matching a particular value, sequence, location, etc. If the test is satisfied, the rule performs an associated action. For example, the action may replace a matching combination of numbers with X's. Replacement of the matching combination of numbers is identified as a successful completion of the rule.

In operation 208, the amounts of time required to complete the privacy rules may be identified. For example, privacy rules may need to apply an associated test to all of the data associated with one or more web pages. The privacy rule parser may track the amount of processing time required to parse through the one or more web pages. The time can be tracked on any other variety of dimensions or categories, such as for periods of time, web sessions, web pages, particular fields in the web pages, etc. In operation 210, the privacy metrics are sent to the privacy profiler.
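
Tracking per-rule execution time may be as simple as reading a monotonic clock around each rule invocation; the sketch below is illustrative only, with the rule passed in as any callable of the shape used in the earlier sketches.

```python
import time

def timed_rule_call(rule, page_text: str) -> tuple[str, int, float]:
    """Run one privacy rule against a page and record its execution time.

    `rule` is any callable returning (filtered_text, completions); the
    returned float is the milliseconds spent executing the rule.
    """
    start = time.perf_counter()
    filtered, completions = rule(page_text)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return filtered, completions, elapsed_ms
```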

FIG. 6 depicts an example process for generating privacy profiles. The privacy profiler receives the privacy metrics from the privacy rule parser in operation 230. The privacy metrics are aggregated in operation 232 for a selectable time period. For example, the privacy profiler may count the total number of times a particular rule is triggered during an aggregation period. The aggregation period may be any selectable time period, such as seconds, minutes, hours, days, etc. The privacy metrics may be aggregated for different dimensions, such as for all privacy rules, individual privacy rules, privacy rule calls, privacy rule completions, privacy rule execution times, web sessions, web pages, etc.

Operation 234 determines when the aggregation period has completed. For example, the aggregation period may be five minutes and the privacy profiler may count the number of times each privacy rule is triggered during the five minute aggregation period.

In operation 236, averages may be calculated for certain privacy metrics. For example, the privacy profiler may calculate the average number of times the privacy rules are triggered for each web session, the average completion rate for the privacy rules, and the average execution times for the privacy rules. Operation 238 stores the aggregated averaged privacy metrics in a database as the privacy profiles.

FIG. 7 depicts an example process for automatically identifying irregularities in the privacy filtering process or irregularities in the captured web sessions. In operation 250, the privacy event processor may receive new privacy metrics from the privacy profiler. For example, the new privacy metrics may be associated with the last five minutes of privacy filtering by the privacy rule parser.

In operation 252, the new privacy metrics may be compared with previously generated privacy profiles. For example, as described above, the average execution time for a particular privacy rule over the last five minutes may be compared with the average execution times identified with the rule in the privacy profiles. The privacy event processor may send a notification in operation 256 for any of the most recent privacy metrics that extend outside of a threshold range of the privacy profiles in operation 254. For example, an email may be sent to a system operator or an entry may be added to a log file identifying the particular rule, associated privacy metrics, and any other associated web session information, such as time, web session, web page, etc.

In operation 258, the new privacy metrics may be added to the existing privacy profiles. For example, the privacy profiles may track the average execution times of the rules over entire days, weeks, months, or years. The new privacy metrics may identify the next time period for the privacy profile. In one example, the new privacy metrics may be further accumulated with other accumulated and averaged privacy metrics in the privacy profiles. For example, all of the privacy metrics for the last hour may be accumulated and averaged to generate one reference point in a day, week, or month long privacy profile.
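
Folding each new aggregation period into a long-running profile can be done with an incremental mean update; the following sketch is one assumed way to do it, not a method required by the system.

```python
def update_profile(profile_mean: float, num_periods: int,
                   new_value: float) -> tuple[float, int]:
    """Fold one new aggregated metric into a running profile mean."""
    num_periods += 1
    profile_mean += (new_value - profile_mean) / num_periods
    return profile_mean, num_periods

# Example: a profile averaging 2.5 ms over 11 periods absorbs a 1.0 ms reading.
mean, n = update_profile(2.5, 11, 1.0)
print(round(mean, 3), n)  # 2.375 12
```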

FIGS. 8A and 8B depict examples of privacy metrics displayed for different privacy rules. Selection boxes 302 may be used for selecting different privacy metrics or privacy dimensions for displaying on an electronic page 300. A selection box 302A may select a parameter or dimension for displaying on the vertical Y-axis and a selection box 302B may select a parameter or dimension for displaying along a horizontal X-axis.

For example, selection box 302A may select the vertical Y-axis to represent the average execution time required for the privacy processing system to complete execution of different privacy rules for captured web session data. Selection box 302B may select the horizontal X-axis to represent a particular time period for displaying the average execution times, such as for a particular day. FIG. 8A shows a range for the average execution time on the Y-axis of between 0.0 milliseconds (ms) and 5.0 ms and a time range on the X-axis of between 8:00 am and 3:00 pm. Of course, other privacy dimensions and time ranges may be displayed on the X and Y axes.

A selection box 302C may select the privacy rules for displaying associated privacy metrics. For example, selection box 302C may select all rules #1, #2, and #3 for displaying associated average execution times. A selection box 302D may select a web session category for displaying associated privacy metrics. For example, selection box 302D may select privacy metrics to be displayed for captured web sessions, captured web pages within the captured web sessions, etc.

Based on the entries in selection boxes 302, the privacy processing system displays three lines 304, 306, and 308 representing changes in the average execution times over a particular day for rules #1, #2, and #3, respectively. In this example, line 304 stays relatively constant at around 4.5 ms and line 306 stays relatively constant at around 3.5 ms. Normal variations may be expected in the average execution times due to different user activities during the web sessions. For example, users may navigate through different web pages during the web sessions and may or may not complete transactions during those web sessions. Thus, different types and amounts of aggregated data may be captured for different individual web sessions that may or may not trigger the execution of certain privacy rules and may vary the amounts of time required to execute the privacy rules.

Line 308 shows a substantial change in the average execution time for rule #3 sometime after 11:00 am. Up until around 11:00 am the average execution time is around 2.5 ms and after 11:00 am the average execution time drops to around 1.0 ms. The change in the average execution time may indicate a problem with rule #3. For example, the web application may have changed a web page name or field name that was previously used for triggering rule #3. As a result, rule #3 may no longer be called for the renamed web page and personal information in the renamed web page may no longer be filtered by rule #3.

Line 308 identifies a potential filtering problem associated with rule #3. An operator may replay some of the web sessions captured after 11:00 am to determine if rule #3 is correctly filtering personal information from the captured web session data. For example, the operator may determine if rule #3 is removing a social security number from a particular web page in the replayed web session data.

FIG. 8B depicts another example of privacy metrics displayed by the privacy processing system. The operator may decide to investigate in more detail the change in the average execution time for rule #3. Either via entries in selection boxes 302 or by selecting line 308, the privacy processing system may display bar graphs showing other privacy metrics for rule #3 associated with different web pages within the web sessions. For example, the operator may select rule #3 in selection box 302C and select a web page category in selection box 302D.

In response, the privacy processing system may display different bar graphs 320, 322, and 324, each associated with a different web page that may have been displayed during the captured web sessions. For example, bar graph 320 may be associated with a log-in page for the web session where a user logs into a web account, bar graph 322 may be associated with an accounts web page where a user enters address information, and bar graph 324 may be associated with a checkout page where the user completes a transaction for purchasing a product or service.

A first solid line bar graph may represent the average execution time for rule #3 at 11:00 am and a second dashed line bar graph may represent the average execution time for rule #3 at 12:00 pm. Bar graph 320 shows that the average execution time for privacy rule #3 for the log-in web page did not change substantially from 11:00 am to 12:00 pm and bar graph 322 shows that the average execution time for privacy rule #3 for the accounts web page did not change substantially from 11:00 am to 12:00 pm. However, bar graph 324 shows that the average execution time for rule #3 when applied to the checkout web page substantially decreased from around 2.5 ms at 11:00 am to around 0.5 ms at 12:00 pm.

The operator may use the replay system or may use other search software to then determine if rule #3 is correctly filtering personal information captured by the checkout web page. For example, by replaying some of the web sessions captured after 12:00 pm, the operator may determine that rule #3 is not filtering credit card numbers from the captured web session data. This would provide an early warning of a privacy breach.

FIGS. 9A and 9B depict another example of how the privacy processing system may display privacy metrics that identify irregularities in the privacy filtering process. In this example, selection box 302A selects the vertical Y-axis to represent the successful completion rate for different rules. As explained above, the percentage completion rate may indicate the percentage of times a particular rule was called or triggered and then successfully completed an associated action, such as replacing or encrypting a sequence of numbers.

FIG. 9A shows successful completion percentages between 60% and 100%. Selection box 302B selects the time period for displaying the successful completion rates between 7:00 am and 1:00 pm. Selection box 302C selects all rules #1, #2, and #3 for displaying associated completion rates and selection box 302D selects web sessions as the category of web session data for displaying associated completion rates.

Based on the entries in selection boxes 302, the privacy processing system displays three lines 340, 342, and 344 showing the changes over a particular day for the completion rates for rules #1, #2, and #3, respectively. In this example, line 340 shows that privacy rule #1 stays relatively constant at around 90% and line 342 shows that privacy rule #2 stays relatively constant at around 80%.

Variations in the completion rate also may be expected due to the different user activities during the web sessions. Again, users may navigate through different web pages during the web sessions and may or may not complete transactions during those web sessions. For example, some users may enter credit card information into web pages during web sessions that later, during privacy filtering, may trigger certain privacy rules and allow the triggered privacy rules to complete associated actions. Users in other web sessions may never enter credit card numbers into web pages and thus prevent some privacy rules from completing their associated actions.

Line 344 shows a substantial increase in the completion rate for privacy rule #3 sometime between 9:00 am and 10:00 am. Up until around 9:30 am the completion rate for privacy rule #3 is around 60% and after 10:00 am the completion rate for rule #3 increases to over 80%. The increase in the completion rate may indicate a problem with privacy rule #3. For example, modifications to a web page may cause rule #3 to unintentionally replace all of the data entered into the web page. As a result, privacy rule #3 may remove web session data needed for properly replaying and analyzing captured web sessions.

Thus, line 344 identifies a potential filtering problem associated with privacy rule #3. The operator may again replay some web sessions captured after 10:00 am to determine if rule #3 is filtering the correct information from the captured web session data.

FIG. 9B depicts other privacy metrics displayed by the privacy processing system. The operator may decide to investigate the change in the completion rate for privacy rule #3. Either via entries in selection boxes 302 or by selecting line 344, the privacy processing system may display bar graphs showing privacy metrics for different web browsers used during the web sessions. For example, the operator may select privacy rule #3 in selection box 302C and select browsers in selection box 302D.

In response, the privacy processing system may display different bar graphs 350, 352, and 354, each associated with a different web browser or web application that may have been used during the captured web sessions. For example, bar graph 350 may be associated with a mobile web browser or mobile application used on mobile devices, bar graph 352 may be associated with a desktop web browser used on a personal computer, and bar graph 354 may be associated with a second type of desktop web browser used on personal computers.

A first solid line bar graph may represent the completion rates for privacy rule #3 at 9:00 am and a second dashed line bar graph may represent the completion rates for privacy rule #3 at 12:00 pm. Bar graphs 352 and 354 show that the completion rates associated with the two desktop browsers did not change substantially between 9:00 am and 10:00 am. This may indicate that the web sessions conducted with the two desktop browsers and the privacy filtering associated with the browsers are both operating normally. However, bar graph 350 shows that the completion rate for privacy rule #3 for captured web sessions associated with the mobile browser increased substantially from around 60% at 9:00 am to around 85% at 10:00 am.

The operator may again use the replay system or other software to verify that privacy rule #3 is filtering the correct information in the captured web sessions. For example, replaying some of the 10:00 am mobile browser web sessions may determine that privacy rule #3 is filtering the wrong data. The test algorithm may interpret rule #3 differently for data originating from a mobile web browser, causing the data to be handled differently.

FIGS. 10A-10C depict examples of how the replay system may confirm proper privacy filtering of captured web session data. FIG. 10A depicts a mobile device 370 having a screen displaying an electronic web page 374 used to enter personal information for completing a credit card transaction. In this example, a user may enter a name into a name field 372A, enter a street address into an address field 372B, enter a town and zip code into a city field 372C, and enter a credit card number into a credit card field 372D. As explained above, a monitoring system may capture and send the personal information entered into fields 372 to the privacy processing system.

FIG. 10B shows a replayed web session after the captured web session data was filtered by the privacy processing system. The web session may be replayed on computing device 160 and may replay electronic web page 374. The capture and filtering of the web session data may have happened around 9:00 am. FIG. 10B may represent a properly filtered web session where only the first eight digits of the credit card number were replaced with X's.

FIG. 10C shows a second replayed web session after the captured web session data was filtered by the privacy processing system. The filtering of the captured web session data may have happened around 10:00 am. FIG. 10C may represent incorrectly filtered web session data where all of the information captured in electronic web page 374 is replaced with X's. Thus, the replay system can be used to further investigate and identify privacy filtering problems that may have originally been identified by comparing privacy metrics with the privacy profiles.

Hardware and Software

FIG. 11 shows a computing device 1000 that may be used for operating the privacy processing system and performing any combination of the privacy processing operations discussed above. The computing device 1000 may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. In other examples, computing device 1000 may be a personal computer (PC), a tablet, a Personal Digital Assistant (PDA), a cellular telephone, a smart phone, a web appliance, or any other machine or device capable of executing instructions 1006 (sequential or otherwise) that specify actions to be taken by that machine.

While only a single computing device 1000 is shown, the computing device 1000 may include any collection of devices or circuitry that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the operations discussed above. Computing device 1000 may be part of an integrated control system or system manager, or may be provided as a portable electronic device configured to interface with a networked system either locally or remotely via wireless transmission.

Processors 1004 may comprise a central processing unit (CPU), a graphics processing unit (GPU), programmable logic devices, dedicated processor systems, micro controllers, or microprocessors that may perform some or all of the operations described above. Processors 1004 may also include, but may not be limited to, an analog processor, a digital processor, a microprocessor, multi-core processor, processor array, network processor, etc.

Some of the operations described above may be implemented in software and other operations may be implemented in hardware. One or more of the operations, processes, or methods described herein may be performed by an apparatus, device, or system similar to those described herein and with reference to the illustrated figures.

Processors 1004 may execute instructions or “code” 1006 stored in any one of memories 1008, 1010, or 1020. The memories may store data as well. Instructions 1006 and data can also be transmitted or received over a network 1014 via a network interface device 1012 utilizing any one of a number of well-known transfer protocols.

Memories 1008, 1010, and 1020 may be integrated together with processing device 1000, for example RAM or FLASH memory disposed within an integrated circuit microprocessor or the like. In other examples, the memory may comprise an independent device, such as an external disk drive, storage array, or any other storage devices used in database systems. The memory and processing devices may be operatively coupled together, or in communication with each other, for example by an I/O port, network connection, etc., such that the processing device may read a file stored on the memory.

Some memory may be “read only” by design (ROM) by virtue of permission settings, or not. Other examples of memory may include, but may not be limited to, WORM, EPROM, EEPROM, FLASH, etc., which may be implemented in solid state semiconductor devices. Other memories may comprise moving parts, such as a conventional rotating disk drive. All such memories may be “machine-readable” in that they may be readable by a processing device.

“Computer-readable storage medium” (or alternatively, “machine-readable storage medium”) may include all of the foregoing types of memory, as well as new technologies that may arise in the future, as long as they may be capable of storing digital information in the nature of a computer program or other data, at least temporarily, in such a manner that the stored information may be “read” by an appropriate processing device. The term “computer-readable” may not be limited to the historical usage of “computer” to imply a complete mainframe, mini-computer, desktop, wireless device, or even a laptop computer. Rather, “computer-readable” may comprise storage medium that may be readable by a processor, processing device, or any computing system. Such media may be any available media that may be locally and/or remotely accessible by a computer or processor, and may include volatile and non-volatile media, and removable and non-removable media.

Computing device 1000 can further include a video display 1016, such as a liquid crystal display (LCD) or a cathode ray tube (CRT), and a user interface 1018, such as a keyboard, mouse, touch screen, etc. All of the components of computing device 1000 may be connected together via a bus 1002 and/or network.

For the sake of convenience, operations may be described as various interconnected or coupled functional blocks or diagrams. However, there may be cases where these functional blocks or diagrams may be equivalently aggregated into a single logic device, program, or operation with unclear boundaries.

Having described and illustrated the principles of a preferred embodiment, it should be apparent that the embodiments may be modified in arrangement and detail without departing from such principles. Claim is made to all modifications and variations coming within the spirit and scope of the following claims.

1-20. (canceled)
21. A privacy processing system comprising: a set of privacy rules, wherein each of the privacy rules is for application in filtering sensitive personal information from web session data; a set of privacy profiles, wherein each of the privacy profiles includes metrics associated with application of one or more of the privacy rules to web session data; and a privacy event processor for: using one or more of the privacy profiles, identifying an irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data associated with a web session; and upon identification of the irregularity, sending an electronic notification relating to the identified irregularity.
22. The system of claim 21, wherein the web session includes an online purchase of a product or service, and wherein the irregularity is associated with a filtering problem that includes a privacy rule that does not filter sensitive personal information comprising information associated with completing the online purchase.
23. The system of claim 21, wherein the web session includes a credit card purchase, and wherein the irregularity is associated with a filtering problem that includes a privacy rule that does not filter credit card information associated with the purchase.
24. The system of claim 21, comprising a privacy profiler for generating the privacy profiles.
25. The system of claim 21, wherein the privacy profiles identify how often the privacy rules are called, how often the privacy rules successfully complete actions, and amounts of processing time required to execute the privacy rules.
26. The system of claim 21, wherein application of the privacy rules in filtering sensitive personal information from web session data comprises using the privacy rules to remove sensitive personal information.
27. The system of claim 21, wherein the metrics are privacy metrics including aggregated metrics relating to privacy rule use over time.
28. The system of claim 21, wherein the metrics are privacy metrics that include aggregated metrics relating to privacy rule use over time, and wherein comparison of aggregated metrics is used in identifying the irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data.
29. The system of claim 21, wherein the metrics are privacy metrics that include aggregated metrics relating to privacy rule use over time, and wherein comparison of aggregated metrics is used in identifying the irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data, and wherein identifying the irregularity comprises, for a first privacy rule: determining a first statistical metric associated with use of the first privacy rule in filtering sensitive personal information from web session data over a first period of time; determining a second statistical metric associated with use of the first privacy rule in filtering sensitive personal information from web session data over a second period of time; and comparing the first statistical metric to the second statistical metric in determining whether a threshold deviation has been reached between the first statistical metric and the second statistical metric, wherein reaching the threshold deviation is associated with identifying the irregularity.
30. The system of claim 21, wherein the privacy rules are for application in filtering sensitive personal information from the web session data before storing web session data, of the web session data, in a database for subsequent replay analysis.
31. The system of claim 21, wherein the privacy processing system is for replaying the web session with the sensitive personal information filtered therefrom in identifying the irregularity.
32. The system of claim 21, wherein the irregularity is associated with a filtering problem.
33. The system of claim 21, wherein the irregularity is associated with a filtering problem that includes a privacy rule that does not filter sensitive personal information from web session data.
34. The system of claim 21, wherein the irregularity is associated with a filtering problem that includes a privacy rule that does not filter sensitive personal information from web session data as required by one or more government privacy regulations.
35. The system of claim 21, wherein the notification is sent to be received or accessed by an operator of at least a portion of the privacy processing system.
36. The system of claim 21, wherein the notification comprises an email.
37. The system of claim 21, wherein the notification comprises a log entry accessible by an operator of at least a portion of the privacy processing system.
38. A method comprising: generating a set of privacy rules for application in filtering sensitive personal information from web session data; generating a set of privacy profiles, wherein each of the privacy profiles includes metrics associated with application of one or more of the privacy rules to web session data; using one or more of the privacy profiles, identifying an irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data associated with a web session; and upon identification of the irregularity, sending an electronic notification relating to the identified irregularity.
39. The method of claim 38, comprising identifying the irregularity, wherein the web session includes an online purchase of a product or service, and wherein the irregularity is associated with a filtering problem that includes a privacy rule that does not filter sensitive personal information comprising information associated with completing the online purchase.
40. The method of claim 38, comprising identifying the irregularity, wherein the web session includes a credit card transaction, and wherein the irregularity is associated with a filtering problem that includes a privacy rule that does not filter credit card information associated with completing the transaction.
41. The method of claim 38, wherein the irregularity is determined to be associated with Document Object Model (DOM) changes made to filtered web pages.
42. The method of claim 38, wherein the metrics are privacy metrics that include aggregated metrics relating to privacy rule use over time, and wherein comparison of aggregated metrics is used in identifying the irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data.
43. The method of claim 38, wherein the metrics are privacy metrics that include aggregated metrics relating to privacy rule use over time, and wherein comparison of aggregated metrics is used in identifying the irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data, and wherein identifying the irregularity comprises, for a first privacy rule: determining a first statistical metric associated with use of the first privacy rule in filtering sensitive personal information from web session data over a first period of time; determining a second statistical metric associated with use of the first privacy rule in filtering sensitive personal information from web session data over a second period of time; and comparing the first statistical metric to the second statistical metric in determining whether a threshold deviation has been reached between the first statistical metric and the second statistical metric, wherein reaching the threshold deviation is associated with identifying the irregularity.
44. A non-transitory computer readable medium or media containing instructions for executing a method comprising: generating a set of privacy rules for application in filtering sensitive personal information from web session data; generating a set of privacy profiles, wherein each of the privacy profiles includes metrics associated with application of one or more of the privacy rules to web session data; using one or more of the privacy profiles, identifying an irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data associated with a web session; and upon identification of the irregularity, sending an electronic notification of the identified irregularity; wherein the web session includes an online purchase of a product or service, and wherein the irregularity is associated with a filtering problem that includes a privacy rule that does not filter sensitive personal information comprising information associated with completing the online purchase; and wherein the metrics are privacy metrics that include aggregated metrics relating to privacy rule use over time, and wherein comparison of aggregated metrics is used in identifying the irregularity in application of one or more of the privacy rules in filtering sensitive personal information from web session data, and wherein identifying the irregularity comprises, for a first privacy rule of the privacy rules: determining a first statistical metric associated with use of the first privacy rule in filtering sensitive personal information from web session data over a first period of time; determining a second statistical metric associated with use of the first privacy rule in filtering sensitive personal information from web session data over a second period of time; and comparing the first statistical metric to the second statistical metric in determining whether a threshold deviation has been reached between the first statistical metric and the second statistical metric, wherein reaching the threshold deviation is associated with identifying the irregularity.